Monitoring network traffic data to detect any hidden patterns of anomalies is a challenging and time-consuming task that requires high computational resources. To this end, an appropriate summarization technique is of prime importance, since it can serve as a substitute for the original data. However, summarizing data comes with the risk of losing anomalies. Therefore, it is vital to create a summary that preserves the same patterns as the original data. In this paper, we propose an intelligent summarization approach for identifying hidden anomalies, called INSIDENT. The proposed approach guarantees that the original data distribution is preserved in the summarized data. Our approach is a clustering-based algorithm that dynamically maps the original feature space to a new feature space by locally weighting features in each cluster. In the new feature space, similar samples are therefore closer together and, consequently, outliers are more detectable. Moreover, selecting representatives based on cluster size keeps the distribution of the summarized data the same as that of the original data. INSIDENT can be used both as a preprocessing step before running anomaly detection algorithms and as an anomaly detection approach. Experimental results on benchmark datasets show that the summary of the data can serve as a substitute for the original data in anomaly detection tasks.
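As a rough illustration of the cluster-proportional selection step described above, here is a minimal sketch: assign points to the nearest of some given centroids, then sample representatives from each cluster in proportion to its size. All names, the fixed centroids, and the sampling rule are illustrative assumptions, not INSIDENT's actual algorithm.

```python
import random
from collections import defaultdict

def summarize(points, centroids, ratio=0.5, seed=0):
    """Pick representatives per cluster proportional to cluster size,
    so the summary roughly preserves the original distribution."""
    rng = random.Random(seed)
    clusters = defaultdict(list)
    for p in points:
        # Assign each point to its nearest centroid (squared distance).
        nearest = min(range(len(centroids)),
                      key=lambda i: sum((a - b) ** 2
                                        for a, b in zip(p, centroids[i])))
        clusters[nearest].append(p)
    summary = []
    for members in clusters.values():
        # Each cluster contributes representatives proportional to its size.
        k = max(1, round(len(members) * ratio))
        summary.extend(rng.sample(members, k))
    return summary
```

Because every cluster contributes at least one representative, small clusters (where outliers tend to live) are never dropped entirely.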
Efficiently exploring enormous amounts of data to make decisions, much like answering a complicated question, is challenging in many real-world application scenarios. In this context, automatic summarization is of substantial importance, as it provides the foundation for big-data analytics. Traditional summarization approaches optimize the system to produce a short, static summary that is meant to fit all users, without considering the subjectivity of summarization, i.e., what different users deem valuable, which makes these approaches impractical in real-world use cases. This paper proposes an interactive, concept-based summarization model, called Adaptive Summaries, that helps users make their desired summary instead of producing a single inflexible one. The system gradually learns from the information users provide as they interact with it by giving feedback in an iterative loop. Users can choose either a reject or an accept action for the concepts included in the summary, along with the importance of each concept from their perspective and the confidence level of their feedback. The proposed approach guarantees interactive speed to keep users engaged in the process. Furthermore, it eliminates the need for reference summaries, which are a challenging issue for summarization tasks. Evaluations show that Adaptive Summaries helps users make high-quality summaries based on their preferences by maximizing the user-desired content in the generated summaries.
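One way to picture the accept/reject feedback loop described above is a weight update per concept, scaled by the user's stated importance and confidence. This is a hypothetical update rule for illustration only; the function name, the default weight, and the clipping are my assumptions, not the paper's actual model.

```python
def update_weight(weights, concept, action, importance, confidence):
    """Adjust a concept's weight from one accept/reject feedback event.
    Hypothetical rule: move the weight by importance * confidence,
    clipped to [0, 1]; unseen concepts start at a neutral 0.5."""
    delta = importance * confidence
    w = weights.get(concept, 0.5)
    w = w + delta if action == "accept" else w - delta
    weights[concept] = min(1.0, max(0.0, w))
    return weights
```

Iterating such updates over a feedback loop would gradually bias the summary toward concepts the user keeps accepting.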
From smoothly pursuing moving objects to rapidly shifting gazes during visual search, humans employ a wide variety of eye movement strategies in different contexts. While eye movements provide a rich window into mental processes, building generative models of eye movements is notoriously difficult, and to date the computational objectives guiding eye movements remain largely a mystery. In this work, we tackled these problems in the context of a canonical spatial planning task, maze-solving. We collected eye movement data from human subjects and built deep generative models of eye movements using a novel differentiable architecture for gaze fixations and gaze shifts. We found that human eye movements are best predicted by a model that is optimized not to perform the task as efficiently as possible but instead to run an internal simulation of an object traversing the maze. This not only provides a generative model of eye movements in this task but also suggests a computational theory for how humans solve the task, namely that humans use mental simulation.
The outburst of COVID-19 in late 2019 was the start of a health crisis that shook the world and took millions of lives in the ensuing years. Many governments and health officials failed to arrest the rapid circulation of infection in their communities. The long incubation period and the large proportion of asymptomatic cases made COVID-19 particularly elusive to track. However, wastewater monitoring soon became a promising data source in addition to conventional indicators such as confirmed daily cases, hospitalizations, and deaths. Despite the consensus on the effectiveness of wastewater viral load data, there is a lack of methodological approaches that leverage viral load to improve COVID-19 forecasting. This paper proposes using deep learning to automatically discover the relationship between daily confirmed cases and viral load data. We trained a Deep Temporal Convolutional Network (DeepTCN) model and a Temporal Fusion Transformer (TFT) model to build a global forecasting model. We supplement the daily confirmed cases with viral loads and other socio-economic factors as covariates to the models. Our results suggest that TFT outperforms DeepTCN and learns a better association between viral load and daily cases. We demonstrate that equipping the models with the viral load improves their forecasting performance significantly. Moreover, viral load is shown to be the second most predictive input, following the containment and health index. Our results reveal the feasibility of training a location-agnostic deep-learning model to capture the dynamics of infection diffusion when wastewater viral load data is provided.
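Supplementing a target series with covariates, as described above, typically means building lookback/horizon training windows in which lagged targets are aligned with covariate values. The sketch below shows that generic framing; it is an assumption-level illustration, not the paper's actual DeepTCN/TFT input pipeline.

```python
def make_windows(target, covariates, lookback, horizon):
    """Build (past, future) training pairs for a global forecasting model.
    Each past step pairs the lagged target (e.g., daily cases) with its
    aligned covariates (e.g., wastewater viral load)."""
    samples = []
    for t in range(lookback, len(target) - horizon + 1):
        past = [(target[i], *covariates[i]) for i in range(t - lookback, t)]
        future = target[t:t + horizon]
        samples.append((past, future))
    return samples
```

Pooling such windows across many locations is what makes the resulting model "global" (location-agnostic).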
Single-cell sequencing assay for transposase-accessible chromatin (scATAC-seq) assesses genome-wide chromatin accessibility in thousands of cells to reveal regulatory landscapes at high resolution. However, the analysis presents challenges due to the high dimensionality and sparsity of the data. Several methods have been developed, including the term frequency-inverse document frequency (TF-IDF) transformation, dimension reduction methods such as singular value decomposition (SVD), factor analysis, and autoencoders. Yet a comprehensive study of these methods has not been performed, and it is not clear what the best practice is when analyzing scATAC-seq data. We compared several scenarios for transformation and dimension reduction, as well as SVD-based feature analysis, to investigate potential enhancements in scATAC-seq information retrieval. Additionally, we investigated whether autoencoders benefit from the TF-IDF transformation. Our results reveal that the TF-IDF transformation generally leads to improved clustering and biologically relevant feature extraction.
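For readers unfamiliar with applying TF-IDF to a cells-by-peaks matrix, here is one common variant: row-normalized counts (TF) weighted by the log inverse frequency of each peak across cells (IDF). Note that scATAC-seq tools differ in the exact TF and IDF formulas, so this is a sketch of the general idea, not any specific tool's implementation.

```python
import math

def tfidf(matrix):
    """TF-IDF for a cells x peaks count matrix.
    tf = counts normalized per cell; idf = log(n_cells / peak frequency)."""
    n = len(matrix)
    # Document frequency: in how many cells is each peak accessible?
    df = [sum(1 for row in matrix if row[j] > 0)
          for j in range(len(matrix[0]))]
    out = []
    for row in matrix:
        total = sum(row) or 1
        out.append([(v / total) * math.log(n / df[j]) if df[j] else 0.0
                    for j, v in enumerate(row)])
    return out
```

Peaks open in nearly every cell get weights near zero, while rarer, more cell-type-specific peaks are up-weighted, which is why the transformation tends to help downstream clustering.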
This work concerns developing communication- and computation-efficient methods for large-scale multiple testing over networks, which is of interest in many practical applications. We take an asymptotic approach and propose two methods, proportion-matching and greedy aggregation, tailored to distributed settings. The proportion-matching method achieves the global BH performance yet only requires a one-shot communication of the (estimated) proportion of true null hypotheses as well as the number of p-values at each node. By focusing on the asymptotic optimal power, we go beyond the BH procedure by providing an explicit characterization of the asymptotic optimal solution. This leads to the greedy aggregation method, which effectively approximates the optimal rejection regions at each node, while computational efficiency comes naturally from the greedy approach. Extensive numerical results over a variety of challenging settings are provided to support our theoretical findings.
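As background for the "global BH performance" that proportion-matching targets, here is the classic centralized Benjamini-Hochberg procedure: reject the k smallest p-values, where k is the largest rank with p_(k) <= (k/m) * alpha. This sketch is the standard baseline, not the paper's distributed method.

```python
def benjamini_hochberg(pvals, alpha):
    """Centralized BH procedure; returns the set of rejected indices."""
    m = len(pvals)
    order = sorted(range(m), key=lambda i: pvals[i])
    k = 0
    for rank, i in enumerate(order, start=1):
        # Largest rank whose sorted p-value clears the BH line.
        if pvals[i] <= rank * alpha / m:
            k = rank
    return {order[r] for r in range(k)}
```

In the distributed setting studied here, running this procedure naively would require gathering every p-value at a central node, which is exactly the communication cost the proposed methods avoid.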
Video conferencing systems suffer from poor user experience when network conditions deteriorate, because current video codecs simply cannot operate at extremely low bitrates. Recently, several neural alternatives have been proposed that reconstruct talking-head video at very low bitrates using a sparse representation of each frame, such as facial landmark information. However, these approaches produce poor reconstructions in scenarios with significant movement or occlusion during the call, and they do not scale to higher resolutions. We design Gemino, a new neural compression system for video conferencing based on a novel high-frequency-conditional super-resolution pipeline. Gemino upsamples a very low-resolution version of each target frame while enhancing high-frequency details (e.g., skin texture, hair, etc.) based on information extracted from a single high-resolution reference image. We use a multi-scale architecture that runs different components of the model at different resolutions, allowing it to scale to resolutions comparable to 720p, and we personalize the model to learn each person's specific details, achieving much better fidelity at low bitrates. We implement Gemino on top of aiortc, an open-source Python implementation of WebRTC, and show that it runs in real time on 1024x1024 video on an A100 GPU and achieves a lower bitrate than traditional video codecs at the same perceptual quality.
Kronecker regression is a highly structured least-squares problem $\min_{\mathbf{x}} \lVert \mathbf{K}\mathbf{x} - \mathbf{b} \rVert_2^2$, where the design matrix $\mathbf{K} = \mathbf{A}^{(1)} \otimes \cdots \otimes \mathbf{A}^{(N)}$ is a Kronecker product of factor matrices. This regression problem arises in each step of the widely used alternating least squares (ALS) algorithm for computing the Tucker decomposition of a tensor. We present the first subquadratic-time algorithm for solving Kronecker regression to a $(1+\varepsilon)$-approximation that avoids the exponential term $O(\varepsilon^{-N})$ in the running time. Our techniques combine leverage score sampling and iterative methods. By extending our approach to block-design matrices in which one block is a Kronecker product, we also achieve subquadratic-time algorithms for (1) Kronecker ridge regression and (2) updating the factor matrices of a Tucker decomposition in ALS, which is not a pure Kronecker regression problem, thereby improving the running time of all steps of Tucker ALS. We demonstrate the speed and accuracy of this Kronecker regression algorithm on synthetic data and real-world image tensors.
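A key reason Kronecker structure is algorithmically friendly is that iterative least-squares methods only need matrix-vector products with $\mathbf{K}$, and these can be computed without ever materializing the Kronecker product. For two factors, the identity (A ⊗ B) vec(X) = vec(A X Bᵀ) (with row-major vec) gives a product in time far below the size of A ⊗ B. This is a background illustration of that identity, not the paper's algorithm.

```python
def matmul(A, B):
    """Plain dense matrix product on nested lists."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)]
            for row in A]

def kron_matvec(A, B, x):
    """Compute (A kron B) @ x without forming the Kronecker product,
    via (A kron B) vec(X) = vec(A X B^T) with row-major vec."""
    n, q = len(A[0]), len(B[0])
    # Reshape x (length n*q) into an n x q matrix, row-major.
    X = [x[i * q:(i + 1) * q] for i in range(n)]
    Bt = [list(col) for col in zip(*B)]
    Y = matmul(matmul(A, X), Bt)
    return [v for row in Y for v in row]
```

The same reshaping trick chains across more than two factors, which is what makes iterative solvers a natural ingredient alongside leverage score sampling.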
Autonomous driving decision-making is a challenging task due to the inherent complexity and uncertainty of traffic. For example, adjacent vehicles may change lanes or overtake at any time to pass a slow vehicle or to help traffic flow. Anticipating the intentions of surrounding vehicles, estimating their future states, and integrating them into the decision-making process of an automated vehicle can enhance the reliability of autonomous driving in complex driving scenarios. This paper proposes a Prediction-based Deep Reinforcement Learning (PDRL) decision-making model that considers the maneuver intentions of surrounding vehicles in the decision-making process for highway driving. The model is trained using real traffic data and tested under various traffic conditions on a simulation platform. The results show that the proposed PDRL model improves decision-making performance compared to a Deep Reinforcement Learning (DRL) model by reducing the number of collisions, resulting in safer driving.
Abdominal aortic aneurysm (AAA) is a vascular disease in which a section of the aorta enlarges, weakening its walls and potentially rupturing the vessel. Abdominal ultrasound has been used for diagnosis, but due to its limited image quality and operator dependency, CT scans are usually required for monitoring and treatment planning. Recently, abdominal CT datasets have been successfully used to train deep neural networks for automatic aorta segmentation. Knowledge gathered from this solved task can therefore be leveraged to improve ultrasound segmentation for AAA diagnosis and monitoring. To this end, we propose CACTUSS: a common anatomical CT-US space that acts as a virtual bridge between the CT and US modalities to enable automatic AAA screening sonography. CACTUSS makes use of publicly available labeled data to learn a segmentation based on an intermediary representation that inherits properties from both US and CT. We train a segmentation network in this new representation and employ an additional image-to-image translation network that enables our model to perform on real B-mode images. Quantitative comparisons against fully supervised methods demonstrate the capabilities of CACTUSS in terms of Dice score and diagnostic metrics, showing that our approach also meets the clinical requirements for AAA scanning and diagnosis.
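The Dice score used for the quantitative comparison above is a standard overlap metric for binary segmentation masks: 2|A ∩ B| / (|A| + |B|). A minimal sketch (the edge-case convention for two empty masks is my assumption; conventions vary):

```python
def dice_score(pred, truth):
    """Dice coefficient between two flattened binary masks."""
    inter = sum(p and t for p, t in zip(pred, truth))
    size = sum(pred) + sum(truth)
    # Convention assumed here: two empty masks count as a perfect match.
    return 2 * inter / size if size else 1.0
```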